I liked the Financial Times plots for tracking the evolution of COVID-19 (https://www.ft.com/coronavirus-latest), but then they changed to different plots. So here I am more-or-less reproducing those plots (and adding some others). This is generated from an Rmarkdown document that I’ll be rerendering daily.
I’d like to find an easy source of online data that breaks down the US by state, but for now I just use the country-level data from ourworldindata.org:
cases = read.csv("https://covid.ourworldindata.org/data/ecdc/total_cases.csv",
                 stringsAsFactors=FALSE)
cases$date = as.POSIXct(cases$date)
cases$doy = round(as.numeric(difftime(cases$date, as.POSIXct("2019-12-31"), units="days")))
deaths = read.csv("https://covid.ourworldindata.org/data/ecdc/total_deaths.csv",
                  stringsAsFactors=FALSE)
deaths$date = as.POSIXct(deaths$date)
deaths$doy = round(as.numeric(difftime(deaths$date, as.POSIXct("2019-12-31"), units="days")))
Gray dashed lines are doubling times of 1, 2, 3, and 7 days (from steepest to shallowest)
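As a rough sketch of where such reference lines come from: a count that doubles every d days grows as start * 2^(t/d), which is a straight line with slope log10(2)/d per day on a log10 axis. The helper name `doubling_line` and the starting count of 100 are assumptions made for illustration, not taken from the original plotting code.

```r
# Hypothetical helper: counts along a reference line with a fixed doubling time
doubling_line = function(days, doubling.time, start=100) {
  start * 2^(days / doubling.time)
}

days = 0:28
doubling.times = c(1, 2, 3, 7)

# One column per doubling time, one row per day
ref = sapply(doubling.times, function(d) doubling_line(days, d))
colnames(ref) = paste0("double.", doubling.times, "d")

# On a log10 scale each line has constant slope log10(2)/doubling.time
slopes = diff(log10(ref))[1, ]

# Draw the reference lines as gray dashes on a log-scaled y axis
matplot(days, ref, type="l", log="y", lty=2, col="gray",
        ylim=c(100, 1e5), xlab="Days since start", ylab="Count")
```

The shortest doubling time gives the steepest line, which matches the ordering described in the caption.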
By “naive” I mean: at any point in time, divide the cumulative number of deaths by the cumulative number of confirmed cases. This is biased high because the denominator is too small: not all cases are detected, due to lack of testing. On the other hand, it is biased low because some currently active cases will eventually result in deaths. It should eventually converge on the true case fatality rate (CFR) if testing becomes widespread.
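A minimal sketch of that calculation, using made-up cumulative counts rather than the real data. The vector names `toy.cases` and `toy.deaths` are hypothetical.

```r
# Toy cumulative counts for five consecutive days (made-up numbers)
toy.cases  = c(100, 250, 600, 1400, 3000)
toy.deaths = c(  1,   4,  12,   35,   90)

# Naive CFR: cumulative deaths divided by cumulative confirmed cases,
# left as NA wherever no cases have been confirmed yet
naive.cfr = ifelse(toy.cases > 0, toy.deaths / toy.cases, NA)

# Express as a percentage
naive.cfr.pct = round(100 * naive.cfr, 1)
```

With the real data, the same ratio could be formed from matching country columns of `deaths` and `cases`, assuming both tables have one column per country and that their rows cover the same dates.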
The raw daily data is quite noisy, so there are also smoothed versions.
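One possible way to get from cumulative totals to a smoothed daily series is sketched below. The centered 7-day moving average is an assumption chosen for illustration, and the smoother actually used for the plots may differ. The numbers are made up.

```r
# Toy cumulative totals (made-up numbers)
toy.total = c(0, 2, 5, 5, 11, 20, 24, 40, 55, 61, 90, 120)

# Daily new counts are the day-to-day differences of the cumulative totals
daily = diff(c(0, toy.total))

# Centered 7-day moving average; the first and last three days are NA
# because the window does not fit there
smoothed = stats::filter(daily, rep(1/7, 7), sides=2)
```

A 7-day window has the side benefit of averaging out weekly reporting cycles, at the cost of losing three days at each end of the series.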
So far what I’ve found is from the NY Times. It is pretty strangely structured, but oh well:
states = read.csv("https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-states.csv",
                  stringsAsFactors=FALSE)
states$date = as.POSIXct(states$date)
states$doy = round(as.numeric(difftime(states$date, as.POSIXct("2019-12-31"), units="days")))
# Get this into a more sensible structure: matrices with one row per day
# and one column per state (NA where a state has no entry for that day)
us.deaths = t(tapply(states$deaths,
                     list(factor(states$state, levels=sort(unique(states$state))),
                          factor(states$doy, levels=min(states$doy):max(states$doy))),
                     identity))
us.cases = t(tapply(states$cases,
                    list(factor(states$state, levels=sort(unique(states$state))),
                         factor(states$doy, levels=min(states$doy):max(states$doy))),
                    identity))
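To see what this reshaping produces, here is the same pattern applied to a tiny made-up table in long format. The data frame `toy` and its values are hypothetical. It relies on each state/day pair appearing at most once, so that `identity` returns a single value per cell.

```r
# Toy long-format data: one row per (state, day) report, with gaps
toy = data.frame(state=c("B", "A", "A", "B", "A"),
                 doy=c(1, 1, 2, 3, 3),
                 deaths=c(0, 1, 2, 5, 4),
                 stringsAsFactors=FALSE)

# Same pattern as above: states by days, then transpose to days by states
m = t(tapply(toy$deaths,
             list(factor(toy$state, levels=sort(unique(toy$state))),
                  factor(toy$doy, levels=min(toy$doy):max(toy$doy))),
             identity))

# m has rows "1", "2", "3" (days) and columns "A", "B" (states);
# state B has no report for day 2, so that cell is NA
print(m)
```

Spelling out every day from `min(doy)` to `max(doy)` as factor levels is what guarantees a row for each day, even on days when no state reported.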
Still deciding what exactly to plot for the states…